Authors: Jianhui Liu, Yukang Chen, Xiaoqing Ye, Xiaojuan Qi
Category-level 6D pose estimation aims to predict the poses and sizes of unseen objects from a specific category. Thanks to prior deformation, which explicitly adapts a category-specific 3D prior (i.e., a 3D template) to a given object instance, prior-based methods have attained great success and become a major research stream. However, obtaining category-specific priors requires collecting a large number of 3D models, which is labor-intensive and often not accessible in practice. This motivates us to investigate whether priors are necessary to make prior-based methods effective. Our empirical study shows that the 3D prior itself is not what accounts for the high performance. The key is actually the explicit deformation process, which aligns camera and world coordinates under the supervision of world-space 3D models (also called the canonical space). Inspired by these observations, we introduce a simple prior-free implicit space transformation network, namely IST-Net, to transform camera-space features into their world-space counterparts and build correspondence between them implicitly, without relying on 3D priors. Besides, we design camera- and world-space enhancers to enrich the features with pose-sensitive information and geometrical constraints, respectively. Albeit simple, IST-Net becomes the first prior-free method that achieves state-of-the-art performance, with top inference speed on the REAL275 dataset. Our code and models will be publicly available.
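To make the core idea concrete, below is a minimal sketch of what an "implicit space transformation" could look like: a shared MLP maps per-point camera-space features to a world-space feature space, and a consistency loss ties the mapped features to features derived from the world-space (canonical) model during training, replacing explicit prior deformation. This is an illustrative assumption based only on the abstract, not the authors' actual IST-Net; all names (`ImplicitSpaceTransform`, `implicit_correspondence_loss`) and dimensions are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ImplicitSpaceTransform(nn.Module):
    """Hypothetical sketch: map per-point camera-space features to
    world-space features with a shared MLP (not the authors' code)."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, cam_feat: torch.Tensor) -> torch.Tensor:
        # cam_feat: (B, N, C) features extracted from the observed
        # camera-space point cloud; the output is intended to live in
        # an implicit world/canonical feature space.
        return self.mlp(cam_feat)


def implicit_correspondence_loss(world_feat_pred: torch.Tensor,
                                 world_feat_gt: torch.Tensor) -> torch.Tensor:
    # Training-time consistency between transformed features and features
    # supervised by world-space 3D models; no 3D prior is deformed.
    return F.mse_loss(world_feat_pred, world_feat_gt)


# Example: transform features of 1024 observed points.
ist = ImplicitSpaceTransform(feat_dim=128)
world_feat = ist(torch.randn(2, 1024, 128))  # (B, N, C) -> (B, N, C)
```

Under this reading, the network never reconstructs or deforms a template; the camera-to-world alignment that prior-based methods obtain through explicit deformation is instead learned end-to-end in feature space.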
Paper link: http://arxiv.org/pdf/2303.13479v1
More computer science papers: http://cspaper.cn/